Elena Stein
The tasks for this EDA are to:
fig = sns.barplot(x=words, y=words_use_frac)
plt.ylabel('Fraction of times word is used')
plt.xlabel('word')
plt.savefig('Word_usage.png')
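The bar plot above assumes `words` and `words_use_frac` already exist. The notebook does not show how they are built, so the following is only a minimal sketch of one possible approach: it treats the "fraction of times word is used" as the share of tweets that mention each word at least once. The function `word_use_fractions` and the `sample_tweets` list are hypothetical and stand in for the real tweet data.

```python
import re
from collections import Counter


def word_use_fractions(texts, words):
    """Return, for each word, the fraction of texts that mention it at least once.

    Matching is case-insensitive and works on whole tokens only.
    """
    counts = Counter()
    for text in texts:
        # Tokenise into lowercase alphabetic words; a set ignores repeats within one text.
        tokens = set(re.findall(r"[a-z']+", text.lower()))
        for word in words:
            if word in tokens:
                counts[word] += 1
    total = len(texts)
    return [counts[word] / total if total else 0.0 for word in words]


# Hypothetical sample data, not taken from the original dataset.
sample_tweets = [
    "Death toll rises",
    "Recovery is possible",
    "Lockdown extended, recovery slow",
    "Stay safe",
]
words = ["death", "recovery", "lockdown"]
words_use_frac = word_use_fractions(sample_tweets, words)
```

Other definitions are equally plausible, for example counting every occurrence and dividing by the total number of tokens, so the choice should match whatever the original analysis intended.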
Investigating the use of the word "recovery" versus "death" over time
plt.scatter(mean_death.index.minute, mean_death)
plt.scatter(mean_recover.index.minute, mean_recover)
plt.xlabel('Minute')
plt.ylabel('Frequency')
plt.title('Mentions per Minute')
plt.legend(('deaths', 'recovery'))
plt.show()
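The scatter plot above relies on `mean_death` and `mean_recover` being time-indexed series, but their construction is not shown. The sketch below is one way they might be derived with pandas, assuming a DataFrame of tweets with a timestamp column and a text column. The column names `created_at` and `text`, the helper `mentions_per_minute`, and the sample rows are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample; the real notebook would load the tweets collected on 6 May.
tweets = pd.DataFrame(
    {
        "created_at": pd.to_datetime(
            [
                "2020-05-06 12:00:05",
                "2020-05-06 12:00:40",
                "2020-05-06 12:01:10",
                "2020-05-06 12:01:50",
            ]
        ),
        "text": [
            "Death toll rises again",
            "Full recovery after two weeks",
            "Recovery rates are improving",
            "Recovery and death figures released",
        ],
    }
).set_index("created_at")


def mentions_per_minute(frame, word):
    """Return the per-minute fraction of tweets whose text contains `word`."""
    # Lowercase both sides so the substring match is case-insensitive.
    hits = frame["text"].str.lower().str.contains(word.lower(), regex=False)
    # Averaging a 0/1 indicator within each minute gives the share of matching tweets.
    return hits.astype(float).resample("1min").mean()


mean_death = mentions_per_minute(tweets, "death")
mean_recover = mentions_per_minute(tweets, "recover")
```

Matching on the substring "recover" also catches "recovery" and "recovered"; whether that is desirable depends on the question being asked.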
Looking at sentiment over time for the 6th of May
sentiment_covid = sentiment.resample('1min').mean()
plt.scatter(sentiment_covid.index.minute, sentiment_covid)
plt.xlabel('Time (minute)')
plt.ylabel('Sentiment')
plt.title('Sentiment with time for COVID tweets')
plt.show()
This analysis investigates how the Stringency Index evolves over time in different countries, alongside the death rate. For details of how the index is calculated, see https://covidtracker.bsg.ox.ac.uk. We use the API data instead of the CSV files.
Tasks:
import matplotlib.pyplot as plt
df_pivot = df_selected.pivot(index='date_value', columns='country_code', values=['stringency','deaths'])
df_pivot.plot(y='deaths')
plt.title('Deaths over time', fontweight='semibold')
plt.xlabel('Date')
plt.ylabel('Deaths')
plt.xlim('2020-03-01','2020-04-24')
df_pivot.plot(y='stringency')
plt.xlabel('Date')
plt.ylabel('Stringency of measures')
plt.xlim('2020-03-01','2020-04-24')
plt.legend(loc=4)
plt.title('Stringency over time', fontweight='bold')
fig
Tasks:
We plot each country's Stringency Index on a map, shading countries according to their level of stringency.
m
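The mapping code itself is not shown, so the shading step can only be illustrated in isolation. The sketch below assumes the Stringency Index lies on its usual 0 to 100 scale and interpolates linearly between two colours. The function name `stringency_to_hex` and the white-to-dark-red palette are arbitrary choices for illustration, not the notebook's actual colour scheme.

```python
def stringency_to_hex(value, low=(255, 255, 255), high=(128, 0, 0)):
    """Map a stringency value (assumed 0-100) to a hex colour between `low` and `high`."""
    # Clip to the assumed index range, then rescale to 0..1.
    frac = min(max(float(value), 0.0), 100.0) / 100.0
    # Interpolate each RGB channel linearly.
    rgb = [round(lo + (hi - lo) * frac) for lo, hi in zip(low, high)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Hypothetical values keyed by country code, for illustration only.
example_stringency = {"AAA": 0.0, "BBB": 100.0}
example_colours = {code: stringency_to_hex(v) for code, v in example_stringency.items()}
```

Mapping libraries generally provide their own colour scales, which would normally be preferable to a hand-rolled one like this.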
Tasks:
Geocode the tweets based on user location description
m_2
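The geocoding step is not shown in the notebook. In practice it would most likely call a geocoding service; the sketch below swaps that out for a small in-memory lookup table so that the logic of turning a free-text user location into coordinates can be seen without network access. The `GAZETTEER` dictionary, its approximate coordinates, and the function `geocode_location` are hypothetical.

```python
# Hypothetical gazetteer with approximate (latitude, longitude) pairs.
GAZETTEER = {
    "london": (51.5074, -0.1278),
    "new york": (40.7128, -74.0060),
}


def geocode_location(raw, gazetteer):
    """Return (lat, lon) for a free-text location, or None if it cannot be resolved."""
    if not raw:
        return None
    # Keep only the part before the first comma, e.g. "London, UK" -> "london".
    key = raw.split(",")[0].strip().lower()
    return gazetteer.get(key)


# Example user location descriptions, including ones that cannot be resolved.
user_locations = ["London, UK", "  New York ", "the moon", ""]
coordinates = [geocode_location(loc, GAZETTEER) for loc in user_locations]
```

User-supplied locations are often jokes, empty, or ambiguous, so a sizeable share of tweets is likely to remain unresolved, and those rows need to be dropped before plotting.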